NLP Driven Citation Analysis

Author

  • Rahul Jha
Abstract

This paper summarizes ongoing research in NLP (Natural Language Processing) driven citation analysis. It describes experiments and motivating examples showing how this work can enhance traditional scientometric analysis, which simply treats each citation as a “vote” from the citing paper to the cited paper. In particular, we describe our dataset for citation polarity and citation purpose, present experimental results on the automatic detection of these indicators, and demonstrate the use of such annotations for studying research dynamics and scientific summarization. We also look at two complementary problems that arise in NLP-driven citation analysis for a specific target paper. The first is extracting citation context: the implicit citation sentences that do not contain explicit anchors to the target paper. The second is extracting reference scope: the target-relevant segment of a complicated citing sentence that cites multiple papers. We show how these tasks can help improve sentiment analysis and citation-based summarization.

∗ This research was conducted while the authors were at the University of Michigan.

Similar resources

NLP-driven citation analysis for scientometrics

This paper summarizes ongoing research in NLP (Natural Language Processing) driven citation analysis and describes experiments and motivating examples of how this work can be used to enhance traditional scientometrics analysis that is based on simply treating citations as a “vote” from the citing paper to cited paper. In particular, we describe our dataset for citation polarity and citation pur...

Advances in Deep Parsing of Scholarly Paper Content

We report on advances in deep linguistic parsing of the full textual content of 8200 papers from the ACL Anthology, a collection of electronically available scientific papers in the fields of Computational Linguistics and Language Technology. We describe how – by incorporating new techniques – we increase both speed and robustness of deep analysis, specifically on long sentences where deep pars...

NLP4NLP: Applying NLP to Scientific Corpora about Written and Spoken Language Processing

Analyzing the evolutions of the trends of a scientific domain in order to provide insights on its states and to establish reliable hypotheses about its future is the problem we address here. We have approached the problem by processing both the metadata and the text contents of the domain publications. Ideally, one would like to be able to automatically synthesize all the information present in...

Characterizing In-Text Citations Using N-Gram Distributions

This article focuses on a Natural Language Processing (NLP) approach for the analysis of citation functions in scientific papers. Bibliometric studies traditionally rely on citation metadata and count the number of times a publication has been cited. However, some recent studies also rely on full-text processing of papers, e.g. (Boyack et al., 2013), (Bertin et al., 2013, 2014). Th...

Purpose and Polarity of Citation: Towards NLP-based Bibliometrics

Bibliometric measures are commonly used to estimate the popularity and the impact of published research. Existing bibliometric measures provide “quantitative” indicators of how good a published paper is. This does not necessarily reflect the “quality” of the work presented in the paper. For example, when the h-index is computed for a researcher, all incoming citations are treated equally, ignoring t...

Publication date: 2015